xvalDapc performs stratified cross-validation of DAPC using varying numbers of PCs (and keeping the number of discriminant functions fixed); xvalDapc is a generic with methods for data.frame and matrix.

xvalDapc(x, grp, n.pca.max = 300, n.da = NULL,
training.set = 0.9, result = c("groupMean", "overall"),
center = TRUE, scale = FALSE,
n.pca=NULL, n.rep = 30, xval.plot = TRUE, ...)
## S3 method for class 'data.frame':
xvalDapc(x, grp, n.pca.max = 300, n.da = NULL,
training.set = 0.9, result = c("groupMean", "overall"),
center = TRUE, scale = FALSE,
n.pca=NULL, n.rep = 30, xval.plot = TRUE, ...)
## S3 method for class 'matrix':
xvalDapc(x, grp, n.pca.max = 300, n.da = NULL,
training.set = 0.9, result = c("groupMean", "overall"),
center = TRUE, scale = FALSE,
n.pca=NULL, n.rep = 30, xval.plot = TRUE, ...)
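
As a rough illustration of what "stratified" means together with training.set = 0.9, the sketch below draws a training set that keeps the same proportion of individuals from every group. The helper stratified_training_idx is hypothetical and is not part of adegenet; it only assumes that the training fraction is applied within each group, and it may differ from what the package does internally.

```r
# Hypothetical helper (not part of adegenet): draw a stratified training set
# that keeps the same proportion of individuals from every group.
stratified_training_idx <- function(grp, training.set = 0.9) {
  grp <- factor(grp)
  idx_by_group <- split(seq_along(grp), grp)
  # Sample round(training.set * n) individuals within each group.
  picked <- lapply(idx_by_group, function(idx) {
    n_train <- round(training.set * length(idx))
    idx[sample.int(length(idx), n_train)]
  })
  sort(unlist(picked, use.names = FALSE))
}

set.seed(1)
grp <- factor(rep(c("A", "B"), times = c(50, 30)))  # toy grouping
train <- stratified_training_idx(grp, training.set = 0.9)
table(grp[train])   # training individuals per group
table(grp[-train])  # held-out individuals per group
```

Sampling within each group, rather than from the pooled individuals, keeps small groups represented in both the training and the validation set.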

x: a data.frame or a matrix used as input of DAPC.
grp: a factor indicating the group membership of individuals.
n.da: an integer indicating the number of axes retained in the Discriminant Analysis step. If NULL, n.da defaults to 1 less than the number of groups.
center: a logical indicating whether variables should be centred to mean 0 (TRUE, default) or not (FALSE). Always TRUE for [...]
scale: a logical indicating whether variables should be scaled (TRUE) or not (FALSE, default). Scaling consists in dividing variables by their (estimated) standard deviation to account for trivial differences in variances.
n.pca: an integer vector indicating the numbers of PCA axes to be retained for the cross-validation; if NULL, this will be determined automatically.
...: further arguments passed to boot; see Details.

A list
containing seven items, and a plot of the results. The first is a data.frame with two columns, the first giving the number of PCs of PCA retained in the corresponding DAPC, and the second giving the proportion of successful group assignment for each replicate. The second item gives the mean and confidence interval for random chance. The third gives the mean successful assignment at each level of PC retention. The fourth indicates which number of PCs is associated with the highest mean success. The fifth gives the Root Mean Squared Error at each level of PC retention. The sixth indicates which number of PCs is associated with the lowest MSE. The seventh item contains the DAPC carried out with the optimal number of PCs, determined with reference to MSE.
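
The per-level summaries described above can be pictured with base R alone. The sketch below uses a small made-up data.frame in place of the first list item; the column names n.pca and success and all the numbers are illustrative assumptions, and the error formula assumes that each replicate's error is its distance from perfect assignment. It is not taken from the package source.

```r
# Toy stand-in for the first item of the returned list: one row per replicate,
# with the number of retained PCs and the proportion of successful assignment.
# Column names and numbers are made up for illustration only.
res <- data.frame(
  n.pca   = rep(c(10, 20, 30), each = 3),
  success = c(0.80, 0.85, 0.90,
              0.95, 0.90, 1.00,
              0.70, 0.75, 0.80)
)

# Mean successful assignment at each level of PC retention.
mean_success <- tapply(res$success, res$n.pca, mean)

# Root mean squared error at each level, assuming the error of a replicate
# is its distance from perfect assignment (1 - success).
rmse <- tapply(res$success, res$n.pca, function(p) sqrt(mean((1 - p)^2)))

# Number of PCs with the highest mean success and with the lowest error.
best_by_mean <- as.numeric(names(which.max(mean_success)))
best_by_rmse <- as.numeric(names(which.min(rmse)))
```

In this toy data both criteria select the same number of PCs; with real results they can disagree, which is why the returned list reports them separately.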
If xval.plot=TRUE, a scatterplot of the results of cross-validation will be displayed.

If you have a modern computer, it is likely that you have multiple cores on your system. By default, R uses only one of these cores unless you tell it otherwise. For details, please see the documentation of boot. To use multiple cores, you need two arguments:

parallel - what R parallel system to use (see below)
ncpus - number of cores you want to use

On Unix-like systems (Linux or macOS), specify parallel = "multicore". If you are on Windows, you will want to specify parallel = "snow".
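
One way to choose these two values without hard-coding them is sketched below. It relies only on base R and the parallel package shipped with R; the variable names parallel_backend and ncpus are illustrative, and leaving one core free is a habit rather than a requirement.

```r
# Pick a backend accepted by boot's `parallel` argument for this platform.
parallel_backend <- if (.Platform$OS.type == "windows") "snow" else "multicore"

# Count available cores; detectCores() can return NA, so fall back to 1.
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Leave one core free for the rest of the system (a habit, not a rule).
ncpus <- max(1L, n_cores - 1L)

# These two values could then be passed along, for example:
# xvalDapc(x, grp, parallel = parallel_backend, ncpus = ncpus)
```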

See also: dapc

## CROSS-VALIDATION ##
library(adegenet)
data(sim2pop)
xval <- xvalDapc(sim2pop@tab, pop(sim2pop), n.pca.max=100, n.rep=3)
xval

## 100 replicates ##
# Serial version (SLOW!)
system.time(xval <- xvalDapc(sim2pop@tab, pop(sim2pop), n.pca.max=100, n.rep=100))

# Parallel version (faster!); on Windows use parallel = "snow" instead
system.time(xval <- xvalDapc(sim2pop@tab, pop(sim2pop), n.pca.max=100, n.rep=100,
                             parallel = "multicore", ncpus = 2))